Method and apparatus for maintaining consistent per-hop forwarding behavior in a network using network-wide per-hop behavior definitions

ABSTRACT

A method, apparatus, and computer-readable medium configured for maintaining consistent per-hop packet forwarding behavior among a plurality of network devices in a network within a Differentiated Services (DS) domain are disclosed. In one aspect, a method involves creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain. A mapping of each of the PHBs in the network-wide PHB definition to one or more queues of the network devices is determined. Drain size and queue size values are determined for each of the queues to which PHBs are mapped. A mapping of each of the PHBs to a threshold value associated with the queues is determined. Parameters of fragmentation and interleave mechanisms are determined. Network device configuration parameter values based on the mappings, the drain size, and the queue size, etc. are sent to each of the network devices within the DS domain. As a result, consistent PHB is achieved throughout a network using abstract definitions of PHBs.

RELATED APPLICATION

This application is a continuation of and claims priority to U.S. patent application Ser. No. 09/753,034, filed on Dec. 28, 2000, entitled “METHOD AND APPARATUS FOR MAINTAINING CONSISTENT PER-HOP FORWARDING BEHAVIOR IN A NETWORK USING NETWORK-WIDE PER-HOP BEHAVIOR DEFINITIONS,” naming as inventors Arthur Zavalkovsky, Nitsan Elfassy, and Ron Cohen, which claims domestic priority from prior Provisional application Ser. No. 60/250,749, filed Dec. 01, 2000, entitled “METHOD AND APPARATUS FOR MAINTAINING CONSISTENT PER-HOP ROUTING BEHAVIOR IN A NETWORK USING NETWORK-WIDE SERVICE DEFINITIONS,” naming as inventors Arthur Zavalkovsky, Nitsan Elfassy, and Ron Cohen, the contents of both of which are hereby incorporated by reference in their entirety for all purposes.

FIELD OF THE INVENTION

The present invention relates generally to computer networking. The invention relates more specifically to a method and apparatus for maintaining consistent per-hop packet forwarding behavior in a network using network-wide per-hop behavior definitions.

BACKGROUND OF THE INVENTION

A computer network typically comprises a plurality of interconnected entities that transmit (“source”) or receive (“sink”) data frames. A common type of computer network is a local area network (“LAN”) that generally comprises a privately owned network within a single building or campus. LANs employ a data communication protocol (LAN standard) such as Ethernet, FDDI, or Token Ring, that defines the functions performed by the data link and physical layers of a communications architecture (i.e., a protocol stack), such as the Open Systems Interconnection (OSI) Reference Model. In many instances, multiple LANs may be interconnected by point-to-point links, microwave transceivers, satellite hookups, etc., to form a wide area network (“WAN”), metropolitan area network (“MAN”) or Intranet. These internetworks may be coupled through one or more gateways to the global, packet-switched internetwork generally known as the Internet or World Wide Web (WWW).

Each network entity preferably includes network communication software, which may operate in accordance with Transmission Control Protocol/Internet Protocol (TCP/IP). TCP/IP generally consists of a set of rules defining how entities interact with each other. In particular, TCP/IP defines a series of communication layers, including a transport layer and a network layer. At the transport layer, TCP/IP includes both the User Datagram Protocol (UDP), which is a connectionless transport protocol, and TCP, which is a reliable, connection-oriented transport protocol. When a process at one network entity wishes to communicate with another entity, it formulates one or more messages and passes them to the upper layer of the TCP/IP communication stack. These messages are passed down through each layer of the stack where they are encapsulated into packets and frames. Each layer also adds information in the form of a header to the messages. The frames are then transmitted over the network links as bits. At the destination entity, the bits are re-assembled and passed up the layers of the destination entity's communication stack. At each layer, the corresponding message headers are stripped off, thereby recovering the original message that is handed to the receiving process.

One or more intermediate network devices are often used to couple LANs together and allow the corresponding entities to exchange information. For example, a bridge may be used to provide a “bridging” function between two or more LANs. Alternatively, a switch may be utilized to provide a “switching” function for transferring information, such as data frames or packets, among entities of a computer network. Typically, the switch is a computer having a plurality of ports that couple the switch to several LANs and to other switches. The switching function includes receiving data frames at a source port and transferring them to at least one destination port for receipt by another entity. Switches may operate at various levels of the communication stack. For example, a switch may operate at Layer 2, which in the OSI Reference Model, is called the data link layer, and includes the Logical Link Control (LLC) and Media Access Control (MAC) sub-layers.

Other intermediate devices, commonly known as routers, may operate at higher communication layers, such as Layer 3, which in TCP/IP networks corresponds to the Internet Protocol (IP) layer. Conventionally, IP data packets include a corresponding header that contains an IP source address and an IP destination address. Routers or Layer 3 switches may re-assemble or convert received data frames from one LAN standard (e.g., Ethernet) to another (e.g., Token Ring). Thus, Layer 3 devices are often used to interconnect dissimilar subnetworks. Some Layer 3 intermediate network devices may also examine the transport layer headers of received messages to identify the corresponding TCP or UDP port numbers being utilized by the corresponding network entities. Many applications are assigned specific, fixed TCP and/or UDP port numbers in accordance with Request For Comments (RFC) 1700. For example, TCP/UDP port number 80 corresponds to the Hypertext Transfer Protocol (HTTP), while port number 21 corresponds to File Transfer Protocol (FTP) service.

A process executing at a network entity may generate hundreds or thousands of traffic flows that are transmitted across a network. Generally, a traffic flow is a set of messages (frames and/or packets) that typically correspond to a particular task, transaction or operation (e.g., a print transaction) and may be identified by various network and transport parameters, such as source and destination IP addresses, source and destination TCP/UDP port numbers, and transport protocol.

The treatments that are applied to different traffic flows may vary depending on the particular traffic flow at issue. For example, an online trading application may generate stock quote messages, stock transaction messages, transaction status messages, corporate financial information messages, print messages, data backup messages, etc. A network administrator may wish to apply a different policy or service treatment (“quality of service” or “QoS”) to each traffic flow. In particular, the network administrator may want a stock quote message to be given higher priority than a print transaction. Similarly, a $1 million stock transaction message for a premium client should be assigned higher priority than a $100 stock transaction message for a standard customer.

Computer networks include numerous services and resources for use in moving traffic throughout the network. For example, different network links, such as Fast Ethernet, Asynchronous Transfer Mode (ATM) channels, network tunnels, satellite links, etc., offer unique speed and bandwidth capabilities. Additionally, the intermediate devices also include specific resources or services, such as number of priority queues, filter settings, availability of different queue selection strategies, congestion control algorithms, etc.

Individual frames or packets can be marked so that intermediate devices may treat them in a predetermined manner. For example, the Institute of Electrical and Electronics Engineers (IEEE) describes additional information for the MAC header of Data Link Layer frames in Appendix 802.1p to the 802.1D bridge standard.

A Differentiated Services (DS) model is under development by the Internet Differentiated Services Working Group of the Internet Engineering Task Force (IETF). IETF is seeking to promote DS as the “right” approach for implementing scalable service differentiation in large networks. Network devices that implement DS are said to form a DS domain. Such devices may include routers and switches.

The DS model comprises a number of functional elements that are implemented in network nodes. These functional elements include a small set of per-hop forwarding behaviors (PHBs) that routers or switches may apply to packets, packet classification functions, and traffic conditioning functions, including metering, marking, shaping and policing functions. The DS architecture achieves scalability by implementing complex classification and conditioning functions only at network nodes located at network boundaries, and by applying specified per-hop behavior to aggregates of traffic which have been appropriately marked using the DS field in the IP version 4 (“IPv4”) or IPv6 headers. The PHBs are defined to permit a reasonably granular means of allocating buffer and bandwidth resources at each node among competing traffic streams.

Information about DS may be found in: D. Grossman, “New Terminology for Diffserv,” draft-ietf-diffserv-new-terms-02.txt, November, 1999; V. Jacobson et al., “An Expedited Forwarding PHB,” RFC 2598, June, 1999; F. Baker et al., “Management Information Base for the Differentiated Services Architecture,” draft-ietf-diffserv-mib-02.txt, March, 2000; J. Heinanen et al., “Assured Forwarding PHB Group,” RFC 2597, June, 1999; S. Blake et al., “An Architecture for Differentiated Services,” RFC 2475, December, 1998; S. Brim et al., “Per Hop Behavior Identification Codes,” RFC 2836, May, 2000; “Differential Service for the Internet,” available at the “diffserv.lcs.mit.edu” Web site; J. Wroclawski, “Applications, Flexibility, and Differential Services,” May, 1998, available in the Web site document “internet2.edu/qos/may98Workshop/presentations/Wroclawski/sld001.htm”; A. Campbell, “Differential Services: Services and Mechanisms, Lecture 11,” Jan. 13, 2000, available in the Web site document “comet.ctr.columbia.edu/distributed/lectures/lecture11/sld001.htm.”

In general, the network administrator is responsible for defining the QoS provided within a network. A network administrator cannot be expected to understand all of the details that are required to modify the schema for an appropriate per-hop-behavior configuration that is effective across the network. In the current DS model, there is no simple way for an administrator to specify the relationship and interdependencies between the different PHBs that are in effect in the DS domain. In particular, there is no simple way to allocate network resources to each PHB in a device independent way.

Based on the foregoing, there is a clear need for a mechanism that can enable a network administrator to define a network-wide specification of QoS PHBs deployed within a DS domain.

There is also a need for a mechanism that will allow a network administrator to define the relationship and interdependencies between the different per-hop-behaviors within a DS domain.

There is also a need for a mechanism for providing consistent per-hop-behavior device configuration derived from the network-wide PHB specification thus ensuring end-to-end enforcement of the QoS PHBs.

The current DS approach does allow definition of relationships between the different PHBs within an Assured Forwarding (AF) PHB group, but it does not provide an easy way to define the relationship between the resources allocated to all of the PHBs enforced in the domain. In particular, it does not provide a way to specify the relationship among the AF PHB groups, or their relationship to the Expedited Forwarding (EF) and Best Effort (BE) PHBs or to any user-defined PHB. There is also a need for a way to guarantee minimal bandwidth and buffer resources to each PHB, as well as to prevent a PHB from consuming more than its share of these resources.

There is also a need for a way to automatically generate device configuration parameters for the PHBs so that device behavior matches the definition of the PHB.

SUMMARY OF THE INVENTION

The foregoing needs, and other needs that will become apparent from the following description, are achieved in the present invention, which comprises, in certain embodiments, a method, apparatus, and computer-readable medium configured for maintaining consistent per-hop packet forwarding behavior among a plurality of network devices in a network within a Differentiated Services (DS) domain.

In one aspect, a method involves creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain. A mapping of each of the PHBs in the network-wide PHB definition to one or more queues of the network devices is determined. Drain size and queue size values are determined for each of the queues to which PHBs are mapped. A mapping of each of the PHBs to a threshold value associated with the queues is determined. Parameters for fragmentation and interleave mechanisms are determined. Network device configuration parameter values based on the mappings, the drain size, and the queue size, etc. are sent to each of the network devices within the DS domain. As a result, consistent PHB is achieved throughout a network using abstract definitions of PHBs.

In one embodiment, a network-wide PHB table is defined, containing values of parameters that define properties of PHBs. These properties are mapped to detailed per-hop-behavior parameters.

Alternatively, the invention provides a schema that represents a set of quality of service PHBs, and a translation mechanism for converting representations of PHBs into parameters that specify the required per-hop-behaviors for supporting the PHBs in network devices such as routers and switches. The translation mechanism ensures that each device applies the proper treatment to packets that are associated with a specified PHB. The translation mechanism may involve creating and storing queue scheduling mechanisms, configuring the scheduling mechanisms on each device, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1A is a simplified block diagram that conceptually illustrates aspects of an embodiment.

FIG. 1B is a simplified block diagram providing an alternative conceptual illustration of aspects of an embodiment.

FIG. 1C is a block diagram of a computer network in which certain embodiments of the invention may be used.

FIG. 2 is an example of a PHB set schema that may be used to define a PHB set for a DS domain.

FIG. 3 is a block diagram of a process of determining configuration parameters for services.

FIG. 4 is a block diagram of a computer system with which an embodiment may be carried out.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A method and apparatus for maintaining consistent per-hop packet forwarding behavior in a network using network-wide PHB definitions is described. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Conceptual Overview

Within a DS Domain, the network administrator needs to have the ability to choose the set of PHBs that the administrator intends to use and to modify or “tune” the different parameters for each PHB to achieve desired performance in the network. Because all PHBs defined within a DS domain are required to share the domain's bandwidth and buffer resources for each hop, tuning should be made on the PHB set selected for the domain (that is, all PHBs defined for the domain), and not individually per each PHB.

In one embodiment, an abstract representation of a set of PHBs that includes a set of tuning parameters is used to define a network-wide PHB definition for maintaining consistent per-hop-behaviors within a DS domain. In certain embodiments, the tuning parameters are defined such that they do not depend on the details of the individual QoS mechanisms that are implemented on each network device.

FIG. 1A is a simplified block diagram that conceptually illustrates aspects of an embodiment. A network-wide PHB table 2 is defined, containing values of parameters that define properties of PHBs. As indicated by block 4, the properties are mapped to detailed per-hop-behavior parameters 6. PHB table 2 provides a way to describe relationships among PHBs but not the means for configuring each device, which is carried out during the mapping.

FIG. 1B is a simplified block diagram providing an alternative conceptual illustration of aspects of an embodiment. In this embodiment the invention provides a schema 8 that represents a set of quality of service PHBs 10, and a translation mechanism 14 for converting representations of PHBs into parameters 12 that specify the required per-hop-behaviors for supporting the PHBs in network devices such as routers and switches.

Operational Context

FIG. 1C is a block diagram of a computer network 100 in which certain embodiments of the invention may be used. Generally, computer network 100 includes one or more network devices 120, 122, 124, 126, a plurality of workstations 116, 118, a policy management station 102, and a network 128.

Network devices 120, 122 represent edge network devices such as routers, switches, or other similar or equivalent devices that are configured for coloring packets within network 128. In one embodiment, network devices 120, 122 are configured to execute the Cisco Internetworking Operating System (IOS) and are capable of marking packets with DSCP values, i.e., they are compatible with Differentiated Services. Such marking may be carried out using a marker or other software element or application that runs under control of IOS, e.g., an agent or process.

Network devices 124, 126 represent internal network devices (“core devices”) such as routers, switches, or other similar or equivalent devices that are configured for forwarding packets within network 128 based on the color of each packet. In certain embodiments, network devices 124, 126 are configured to execute the IOS and are compatible with Differentiated Services. Network devices 120, 122 and network devices 124, 126 may in fact represent similar or even identical device types and/or models that are each configured to perform a designated function within computer network 100.

Workstations 116, 118 may be personal computers, workstations, or other network end stations at which work is done, such as printers, scanners, facsimile machines, etc. In certain embodiments, workstations 116, 118 are network devices, such as bridges, gateways, routers or switches that allow computer network 100 to connect to another network system. For example, workstation 116 may be an edge device that is configured for coloring packets of a different DS domain. In certain embodiments, workstations 116, 118 execute one or more applications 112, 114. Applications 112, 114 may represent a variety of different computer applications that execute on workstations 116, 118 respectively and which cause data to be sent and received over network 128.

Network 128 comprises any number of network devices. Network 128 may form part of a LAN or WAN. In one embodiment, network 128 is a packet-switched IP network configured as a DS domain whereby treatment of packets that flow through network 128 is controlled and managed by Policy Management Station 102 and network devices 120, 122, 124, 126. In this example, the DS domain includes network devices 120, 122, 124, 126.

Policy Management Station 102 is a computer, or a group of hardware or software components or processes that cooperate or execute in one or more computer systems. In this example, Policy Management Station 102 includes a policy coordinator 104 and one or more policy servers 106, 108, 110, that are coupled to network devices 120, 122, 124, 126. In one embodiment, policy coordinator 104 communicates with policy servers 106, 108, 110 to configure the network devices 120, 122, 124, 126, to control the coloring and forwarding of packets within network 128. For example, to implement a particular per-hop behavior, policy coordinator 104 may direct network devices 120, 122 to color the packets of all Voice Over IP (VOIP) flows with the color gold (high priority) and to color the packets of all File Transfer Protocol (FTP) flows with the color Bronze (low priority). Each color corresponds to a particular PHB and is associated with one or more QoS treatment parameters, e.g., a pre-defined DSCP value and possibly other values or characteristics. Policy coordinator 104 may further direct network devices 124, 126 to apply a particular forwarding policy based on the particular color of each packet that is processed.

In one embodiment, Policy Management Station 102 provides a mechanism whereby a network administrator may easily create and modify (“define” and “tune”) a domain PHB set to coordinate the application of multiple PHBs within a DS domain. An example of a mechanism that may be used by a network administrator to define and tune a domain PHB set is described in detail below. Once the domain PHB set is defined, policy coordinator 104 communicates with the policy servers 106, 108, 110 to cause edge devices 120, 122 and core devices 124, 126 to implement the domain PHB set definition.

An example of a commercial product suitable for use as Policy Management Station 102 is CiscoAssure QoS Policy Manager 1.0, commercially available from Cisco Systems, Inc.

Although the example embodiment of FIG. 1C shows two (2) workstations 116, 118, three (3) policy servers 106, 108, 110, two (2) edge devices 120, 122, and two (2) core devices 124, 126, in other practical embodiments there may be any number of such elements. In addition, Policy Management Station 102 is provided as only an example of one mechanism that may be used to define a domain PHB set. Policy Management Station 102 may be configured as a single component or a plurality of different distributed components that are configured to define a domain PHB set for implementing QoS policies within a network. In addition, in certain embodiments, policy servers 106, 110 are coupled to network 128 and may communicate with edge devices 120 and 122 over network 128.

Defining a Domain PHB Set

FIG. 2 illustrates an example of domain PHB set schema 200 that may be used to define a domain PHB set. In one embodiment, policy coordinator 104 is configured to provide a user interface to allow a user, such as a network administrator, to define a domain PHB set for a particular DS domain, such as domain PHB set schema 200.

In this example, schema 200 is configured as a table having data columns 202, 204, 206, 208 that include specific parameter values that define a particular PHB. Thus, each PHB comprises a plurality of parameter values, and a domain PHB set comprises a set of PHBs that apply to a particular DS domain. For example, column 202 represents a Best Effort (BE) PHB definition, columns 204, 206 respectively represent Assured Forwarding (AF11, AF12) PHB definitions, and column 208 represents an Expedited Forwarding (EF) PHB definition.

One or more Domain PHB tuning parameter values are used to tune the set of PHBs that are contained within a DS domain. In the example of FIG. 2, the Domain PHB tuning parameters for schema 200 include a Name field 210, DSCP field 212, a Forwarding Class field 214, a Default Class field 216, an Immediate Forwarding field 218, a Reserved Bandwidth field 220, a Forwarding Factor field 222, a Reserved Packets field 224, a Buffer Factor field 226, a Packet Size field 228, a Traffic Type field 230 and a Per Hop Delay field 232.

The value of Name field 210 identifies the name for the PHB. Examples include EF, Mission Critical, AF12, etc.

The value of DSCP field 212 identifies the distinct DSCP value that is associated with the PHB.

The Forwarding Class field 214 identifies all PHBs that belong to the same forwarding class. A network node must preserve the order of packets for all PHBs having the same forwarding class. In this example, AF11 and AF12 are defined as belonging to the same forwarding class, namely class “1”.

The Default Class field 216 is a Boolean flag that is set on one and only one PHB entry. When set, the flag value specifies that a particular PHB is the default PHB that is provided to flows having an unspecified PHB selector. In one embodiment, the flag is set by default for the Best Effort PHB.
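The constraints described so far (a distinct DSCP value per PHB and exactly one default PHB) can be sketched as a simple validation routine. This is only an illustration, not the schema of FIG. 2 itself: the `PhbEntry` record and `validate_phb_set` function are hypothetical names, the DSCP values are the commonly recommended code points for these PHBs, and the forwarding class numbers for BE and EF are assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PhbEntry:
    """Simplified, hypothetical record for one column of a domain PHB set."""
    name: str               # Name field, e.g. "EF"
    dscp: int               # DSCP field (6-bit value, 0..63)
    forwarding_class: int   # PHBs sharing a class must keep packet order
    default_class: bool     # Default Class flag


def validate_phb_set(entries: List[PhbEntry]) -> None:
    """Raise ValueError if the set violates the rules described in the text."""
    dscps = [e.dscp for e in entries]
    if any(not 0 <= d <= 63 for d in dscps):
        raise ValueError("a DSCP is a 6-bit value (0..63)")
    if len(set(dscps)) != len(dscps):
        raise ValueError("each PHB needs a distinct DSCP value")
    defaults = [e.name for e in entries if e.default_class]
    if len(defaults) != 1:
        raise ValueError(
            f"exactly one default PHB is required, found {len(defaults)}"
        )


# Illustrative set: AF11 and AF12 share forwarding class 1, as in the text.
# The class numbers chosen for BE (0) and EF (2) are assumptions.
phb_set = [
    PhbEntry("BE", 0, 0, True),
    PhbEntry("AF11", 10, 1, False),
    PhbEntry("AF12", 12, 1, False),
    PhbEntry("EF", 46, 2, False),
]
validate_phb_set(phb_set)  # passes silently for a consistent set
```

Checking the whole set at once, rather than each entry separately, reflects the point that tuning applies to the PHB set selected for the domain and not to individual PHBs.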

The Immediate Forwarding field 218 is a Boolean flag that specifies whether immediate forwarding of packets belonging to the particular PHB is required. In this example, the flag is set in the EF PHB.

The value of Reserved Bandwidth field 220 specifies the bandwidth in Kb/sec reserved for the behavior aggregate. For PHBs that do not require immediate forwarding, the reserved bandwidth value determines the minimum bandwidth that is reserved for this service. In immediate forwarding PHBs, the reserved bandwidth indicates the maximum bandwidth allowed for the behavior aggregate. Such a limit on bandwidth is provided in order to ensure that other behavior aggregates have sufficient bandwidth. In general, all PHBs with the same forwarding class share the same reserved bandwidth value. For example, because AF11 and AF12 share the same forwarding class, they also share the same reserved bandwidth value of 512 Kb/sec, as indicated by the value in FIG. 2 at the intersection of Reserved Bandwidth parameter 220 with column 204.

The value of Forwarding Factor field 222 defines the percentage of the scheduling resources that may be consumed by the behavior aggregate. This is an alternative representation of the reserved bandwidth field. It allows the network administrator to manage scheduling resources without knowledge of the particular link speeds. Within a domain PHB set, an administrator may define a mixed assignment of forwarding factors to some forwarding classes and reserved bandwidth to other classes.
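As a rough sketch of why a percentage is link-independent, the following hypothetical helper converts a forwarding factor into an absolute rate once a link speed is known. The function name and the example link speeds are assumptions made for illustration only.

```python
def reserved_kbps(forwarding_factor_pct: float, link_speed_kbps: float) -> float:
    """Absolute bandwidth share implied by a forwarding factor on one link."""
    if not 0 <= forwarding_factor_pct <= 100:
        raise ValueError("forwarding factor must be a percentage (0..100)")
    return link_speed_kbps * forwarding_factor_pct / 100.0


# The same 25% factor yields a different absolute rate on each interface,
# so the administrator never has to enter per-link numbers.
rates = {speed: reserved_kbps(25, speed) for speed in (1544, 10_000, 100_000)}
```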

The value of Reserved Packets field 224 defines the number of packets that will be reserved in a queue prior to discard. Large values allow for sustained bursts. Within a forwarding class, this parameter specifies the relative drop precedence of PHBs. For example, within the AF PHB group, the total number of packets reserved for AF1x is assigned to AF11, while smaller numbers of packets are reserved for the AF12 and AF13 PHBs respectively.

The value of Buffer Factor field 226 defines a percentage of the buffer resources that are allocated for the behavior aggregate. This is an alternative representation of the Reserved Packets field 224. It allows the network administrator to manage its buffer resources without knowledge of the particular queue lengths. In order to translate the buffer factor field into an Actual Reserved Packet field, a global parameter specifying the total buffer space should be used. The sum of buffer factors assigned to the forwarding classes is 100%. Within each forwarding class, the buffer factor specifies the relative drop precedence of the PHBs.
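The translation from buffer factors to packet counts might look like the sketch below. The function name is hypothetical, and the 30/60/10 split is an assumed example chosen so that a 20-packet total buffer yields whole numbers; it is not taken from FIG. 2.

```python
from typing import Dict


def reserved_packets(
    buffer_factors_pct: Dict[str, float], total_packets: int
) -> Dict[str, int]:
    """Convert per-forwarding-class buffer factors into packet counts."""
    total = sum(buffer_factors_pct.values())
    if abs(total - 100.0) > 1e-9:
        raise ValueError(f"buffer factors must sum to 100%, got {total}")
    return {
        cls: round(total_packets * pct / 100.0)
        for cls, pct in buffer_factors_pct.items()
    }


# Assumed factors per forwarding class; total_packets is the global
# parameter that specifies the total buffer space.
factors = {"BE": 30, "AF1x": 60, "EF": 10}
packets = reserved_packets(factors, total_packets=20)
```

With factors that do not divide the buffer evenly, rounding can make the counts add up to slightly more or less than the total, so a real translation step would need a policy for distributing the remainder.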

Packet Size field 228 is a parameter that describes the average or typical packet size, in bytes, of the traffic of a behavior aggregate. This parameter is used when there is a need to arrive at a byte count representation of the Reserved Packets field. For some schedulers, a drain-size parameter is required per queue. Drain size determines the number of bytes a scheduler fetches from each queue in each cycle. The ratio between drain sizes is determined by either the ratio between the values of Reserved Bandwidth field 220 or the ratio between the forwarding factors. Drain-size values should approximate a multiple of a typical packet size and should not be smaller than one packet size of the PHB that the queue serves.
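One possible way to satisfy these drain-size constraints is sketched below. It is a heuristic assumed for illustration, not a procedure given in this description: it scales the bandwidth ratio until every queue gets at least one typical packet per cycle, then rounds each value to a whole number of packets. The packet sizes in the example are also assumptions.

```python
from typing import Dict, Tuple


def drain_sizes(queues: Dict[str, Tuple[float, int]]) -> Dict[str, int]:
    """Compute per-queue drain sizes in bytes.

    `queues` maps a queue name to (reserved bandwidth, typical packet size
    in bytes). Forwarding factors could be used in place of bandwidths,
    since only the ratio between the values matters.
    """
    # Smallest scale factor that gives every queue at least one packet.
    scale = max(pkt / bw for bw, pkt in queues.values())
    sizes = {}
    for name, (bw, pkt) in queues.items():
        ideal_bytes = scale * bw
        # Round to a whole number of typical packets, never below one.
        n_packets = max(1, round(ideal_bytes / pkt))
        sizes[name] = n_packets * pkt
    return sizes


# 64 vs. 512 Kb/sec reserved; packet sizes of 1500 and 500 bytes are assumed.
sizes = drain_sizes({"BE": (64, 1500), "AF": (512, 500)})
```

In this example the resulting drain sizes keep the 1:8 bandwidth ratio exactly; with less convenient packet sizes the rounding step makes the ratio only approximate.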

The Traffic Type field 230 includes a flag that specifies whether the traffic of the behavior aggregate is elastic or not. Most TCP traffic is elastic and can adapt to the available network resources, in the sense that the TCP protocol can counteract drops by ensuring that dropped packets are re-sent. The RED discard mechanism is useful for elastic PHBs. An example of non-elastic traffic is UDP traffic carrying voice. For this type of traffic, RED does not provide any benefit.

Maximum Per Hop Delay field 232 is a parameter that describes the maximum delay, e.g., in milliseconds, before a packet having a specified behavior aggregate is forwarded. This parameter is primarily relevant to PHBs that provide immediate packet forwarding. On slow speed interfaces, this parameter makes it possible to calculate whether fragmentation and interleave of packets is required. Examples of such mechanisms include MPPP or Frame-Relay FRF-12 LFI. The fragment size (fs) is the amount of data that the link can transmit within the maximum per-hop delay, and is determined according to the relation:

fs = max-phop-delay × link-speed

If fs is larger than the link MTU, then fragmentation and interleave is not required.
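The fragmentation check can be sketched as follows, on the assumption that the fragment size is the number of bytes the link can serialize within the delay budget (delay multiplied by link speed). The function names, the interpretation of Kb/sec as 1000 bits per second, and the 1500-byte default MTU are assumptions for illustration.

```python
def fragment_size_bytes(max_hop_delay_ms: float, link_speed_kbps: float) -> int:
    """Largest fragment, in bytes, that can be sent within the delay budget."""
    # milliseconds * kilobits/second = bits (taking 1 Kb as 1000 bits)
    bits = max_hop_delay_ms * link_speed_kbps
    return int(bits // 8)


def needs_fragmentation(
    max_hop_delay_ms: float, link_speed_kbps: float, mtu_bytes: int = 1500
) -> bool:
    """Fragmentation is not required when the fragment size exceeds the MTU."""
    return fragment_size_bytes(max_hop_delay_ms, link_speed_kbps) <= mtu_bytes


slow = fragment_size_bytes(10, 64)       # 10 ms budget on a 64 Kb/sec link
fast = fragment_size_bytes(10, 10_000)   # same budget on a 10 Mb/sec link
```

On the slow link the fragment size falls far below a typical MTU, so large packets would have to be fragmented and interleaved with the delay-sensitive traffic; on the fast link a full-size packet already fits within the budget.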

Mapping a PHB Set to Device Configuration

The foregoing PHB set schema may be used to automatically create and store a set of specific device configuration parameter values. In one embodiment, a PHB set table may be transformed into another information representation that is useful in device configuration, such as a MIB, PIB, or actual configuration commands for a device.

The IETF DS working group has proposed several alternatives to model PHB configuration on devices, as set forth in F. Baker et al., “Management Information Base for the Differentiated Services Architecture,” draft-ietf-diffserv-mib-02.txt, March, 2000, and M. Fine et al., “Quality of Service Policy Information Base,” draft-mfine-cops-pib-01.txt, Sep. 25, 1999. The configuration of device queues and actions required by the models proposed in these papers can be determined using the PHB schema described herein.

It is also possible to create queue configurations in the models proposed in the foregoing papers that cannot be derived from a PHB set that is defined according to the schema described herein. However, the PHB set schema described herein is believed to adequately model viable PHBs in a simple and straightforward manner. For example, some queue models allow for multiple scheduling priority levels. The need for more than two levels of priority is not known to lead to meaningful PHBs; therefore, the present schema provides only one priority level, denoted by the immediate forwarding flag.

The following description provides an example of how queues and actions defined in the paper of Baker et al. can be configured. Detailed explanations of each of the resulting fields are found in Baker et al. Queues are modeled as a set of independent FIFO queues, each having the parameters shown in Table 1.

TABLE 1 QUEUE PARAMETERS

DiffServQueueEntry ::= SEQUENCE {
    diffServQueueNumber         INTEGER,
    diffServQueueMinimumRate    Unsigned32,
    diffServQueueMaximumRate    Unsigned32,
    diffServQueuePriority       Unsigned32,
    diffServQueueNextTCB        RowPointer,
    diffServQueueStatus         RowStatus
}

The example PHB set defined above can be mapped to the foregoing values in a stored table having the values shown in Table 2.

TABLE 2 EXAMPLE PHB SET QUEUE MAPPING

    NAME                        BE      AF      EF
    diffServQueueNumber         1       2       3
    diffServQueueMinimumRate    64      512     -
    diffServQueueMaximumRate    -       -       512
    diffServQueuePriority       0       0       1
    diffServQueueNextTCB        NULL    NULL    NULL

In order to provide full PHB configuration within the model introduced in Baker et al., the action entry configuration is also specified. Actions are modeled using the values set forth in Table 3.

TABLE 3 ACTION ENTRY PARAMETERS

    DiffServActionEntry ::= SEQUENCE {
        diffServActionNumber          INTEGER,
        diffServActionNext            RowPointer,
        diffServActionDSCP            Dscp,
        diffServActionMinThreshold    Unsigned32,
        diffServActionMaxThreshold    Unsigned32,
        diffServActionDropPolicy      INTEGER,
        diffServActionStatus          RowStatus
    }

The example PHB set defined above can be mapped to the model of Table 3 in a stored table having the values shown in Table 4.

TABLE 4 EXAMPLE PHB SET ACTION MAPPING

    diffServActionNumber          1          2           3          4
    diffServActionNext            queue1     queue2      queue3     queue4
    diffServActionDSCP            BE         AF11        AF12       EF
    diffServActionMinThreshold    2 pckts    4 pckts     2 pckts    2 pckts
    diffServActionMaxThreshold    6 pckts    12 pckts    6 pckts    2 pckts
    diffServActionDropPolicy      random     random      random     tail

The foregoing mappings assume that the total-packet-in-queue parameter is set to 20 packets.

The buffer factor percentages determine maximum queue threshold values. The minimum threshold values are not determined by the PHB set specification. When an active queue management scheme such as Random Early Detection (RED) is used, RED minimal thresholds may be determined by adjusting to network behavior. For RED thresholds, the minimum threshold value is set to be ⅓ of the maximum threshold value. Fragmentation and interleave are not modeled in the paper of Baker et al. and therefore are not included in the foregoing mapping. Determining threshold values provides another way of giving values to the relationship between PHBs within a class.
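As a minimal sketch of the one-third rule, the following Python fragment derives minimum thresholds from the maximum thresholds listed in Table 4. The function name and the dictionary layout are illustrative assumptions. Treating a tail-drop entry as having equal minimum and maximum thresholds is inferred from the EF column of Table 4 and is not stated as a rule in this description.

```python
def min_threshold(max_threshold_pkts, drop_policy):
    """Return the minimum threshold, in packets, for one action entry."""
    if drop_policy == "random":
        # RED: the minimum threshold is one third of the maximum threshold.
        return max_threshold_pkts // 3
    # Tail drop has a single drop point, so min and max coincide (assumption).
    return max_threshold_pkts


# Maximum thresholds and drop policies as listed in Table 4.
TABLE_4_MAX = {
    "BE": (6, "random"),
    "AF11": (12, "random"),
    "AF12": (6, "random"),
    "EF": (2, "tail"),
}

min_thresholds = {
    name: min_threshold(max_pkts, policy)
    for name, (max_pkts, policy) in TABLE_4_MAX.items()
}
print(min_thresholds)
```

With these inputs the computed minimum thresholds agree with the diffServActionMinThreshold row of Table 4.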

Some links may not be able to admit the reserved bandwidth required for each of the forwarding classes. Therefore, the provisioning system that is used should detect these links and report back an admission error.
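The admission check can be sketched as a comparison between the sum of the reserved rates and the link capacity. The following Python fragment is only an illustration: the function name, the use of a single unit for rates and capacity, and the treatment of every rate in Table 2 as a reservation are assumptions, and the capacities in the usage lines are hypothetical.

```python
def check_admission(link_capacity, reserved_rates):
    """Return an error string if the link cannot admit the reserved rates, else None.

    link_capacity and the values of reserved_rates are assumed to share one unit.
    """
    total = sum(reserved_rates.values())
    if total > link_capacity:
        return f"admission error: {total} reserved exceeds link capacity {link_capacity}"
    return None


# Rates per forwarding class taken from Table 2.
RESERVED = {"BE": 64, "AF": 512, "EF": 512}

print(check_admission(1024, RESERVED))   # hypothetical slow link: reports an error
print(check_admission(10000, RESERVED))  # hypothetical fast link: None
```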

Some devices may not be able to implement the PHB set defined herein, either because not enough queues are available, or because the scheduler cannot provide preemptive scheduling. The provisioning system that is used should detect such deficient devices and report back an error. In order to provide the best approximate PHB configuration that can be implemented on a deficient device, additional information should be specified in the PHB set. For example, if only two queues are available on the device, the PHB set should specify whether AF and BE should be mapped to the same queue, or whether EF and AF should.

As a result, PHBs are mapped to specific device configuration parameter values, which may be configured in one or more devices using conventional means, e.g., by messages that are sent to the devices from a policy decision point.

General Rules for Mapping Network Wide PHB Settings to Device Configuration

According to an embodiment, functions are provided for determining the PHB configuration parameters that can be provided to a router or switch to implement the functions described herein. FIG. 3 is a flow diagram of an example method of determining PHB configuration parameters. The process of FIG. 3 may be used, for example, to translate a PHB set table to values in a MIB, values in a PIB, or to another abstract representation of a device configuration.

In block 302, PHBs are mapped to queues. In an embodiment, the mapping is not derived directly from the PHB table; rather, the following rules are observed.

1. All PHBs within the same forwarding class are mapped to the same queue.
2. The maximal number of queues required is equal to the number of forwarding classes in the PHB table.
3. PHBs with the immediate forwarding flag set should be mapped to a queue served by a preemptive scheduler.
4. If not enough queues are available, or if not enough priority queues are available, the network administrator must be notified. If required, the network administrator should specify which PHBs should be mapped to the same queue.

Table 5 is an example of a mapping of PHBs to queues for the PHB definition provided in FIG. 2.

TABLE 5 EXAMPLE SERVICE TO QUEUE MAPPING

    Queue Type    PHB            Queue Number
    1q            Best Effort    1
                  AF1x           1
                  EF             1
    1q1p          Best Effort    1
                  AF1x           1
                  EF             1p
    2q            Best Effort    1
                  AF1x           1
                  EF             2
    2q1p          Best Effort    1
                  AF1x           2
                  EF             1p
    3q            Best Effort    1
                  AF1x           2
                  EF             3

Table 5 provides an example of mappings of PHBs to queues. The queue type column denotes the number of ordinary queues and the number of priority queues. The automatic rules map the PHB specification of FIG. 2 into a queue type 2q1p. The maximal number of queues required is 3. Mappings to queue types 1q1p and 3q best approximate the requirements of the PHB set of FIG. 2, but the network administrator must be notified that the full PHB set cannot be enforced. Mapping of the PHBs to the 2q queue type should be made by the network administrator, as there is no way to determine automatically whether AF should share a queue with BE or with EF.
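The mapping rules of block 302 can be sketched in Python as follows. This is one possible reading, not a definitive implementation: the function name, the tuple layout, the queue labels such as "1p", and the simplification that priority queues are ignored when there are fewer of them than immediate forwarding classes are all assumptions. When the choice is ambiguous, the sketch raises an error to stand in for asking the network administrator.

```python
def map_phbs_to_queues(phbs, num_queues, num_priority):
    """Map PHBs to queues for a device with the given queue type.

    phbs: list of (name, forwarding_class, immediate_flag) in table order.
    Returns (mapping, warnings); raises ValueError when the administrator
    must decide which PHBs share a queue.
    """
    classes, immediate = [], set()
    for _, fwd_class, is_immediate in phbs:
        if fwd_class not in classes:
            classes.append(fwd_class)       # rule 1: one queue per class
        if is_immediate:
            immediate.add(fwd_class)

    warnings, class_queue = [], {}
    priority_classes = [c for c in classes if c in immediate]
    if len(priority_classes) <= num_priority:
        # Rule 3: immediate forwarding classes go to preemptive queues.
        for i, c in enumerate(priority_classes, start=1):
            class_queue[c] = f"{i}p"
        ordinary = [c for c in classes if c not in immediate]
    else:
        warnings.append("no preemptive queue for immediate forwarding PHBs")
        ordinary = list(classes)

    if len(ordinary) <= num_queues:
        for i, c in enumerate(ordinary, start=1):
            class_queue[c] = i
    elif num_queues == 1:
        # Only one ordinary queue: sharing is unambiguous but approximate.
        warnings.append("forwarding classes share a single queue")
        for c in ordinary:
            class_queue[c] = 1
    else:
        # Rule 4: the administrator must say which PHBs share a queue.
        raise ValueError("administrator must specify which PHBs share a queue")

    return {name: class_queue[c] for name, c, _ in phbs}, warnings


PHB_SET = [("BE", "BE", False), ("AF11", "AF1", False),
           ("AF12", "AF1", False), ("EF", "EF", True)]
print(map_phbs_to_queues(PHB_SET, 2, 1))  # queue type 2q1p
```

For the queue types 1q, 1q1p, 2q1p, and 3q the sketch yields the assignments listed in Table 5; for 2q it raises the error, matching the statement that the administrator makes that choice.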

In block 304, drain sizes and queue sizes are calculated and stored in a weights table. A queue's drain size is the maximum number of bytes that may be drained from the queue in one cycle. Therefore, the drain sizes determine the relative bandwidth allocation for each queue. Some devices can be configured directly with the maximal or minimal rate each queue should service. Transformations between the representation of drain size in bytes and other representations are straightforward. A queue size is the maximal number of bytes the queue can hold. Some devices set queue size in terms of packets. Queue size in packets can be calculated by dividing the queue size in bytes by the largest packet size of all PHBs mapped to that queue.

In an embodiment, the weights table is created based on the PHB table, the mapping of PHBs to queues, and the value of an additional parameter, the total queue size. The total queue size, measured in packets, specifies the total queue length on each of the interfaces.

In one embodiment, drain sizes and queue sizes are created as follows:

1. The queue size of a given queue is proportional to the sum of the buffer factors of all PHBs that were mapped to this queue in block 302. The sum is a percentage of the total queue size. Alternatively, the queue size is proportional to the sum of the reserved packets of all PHBs that were mapped to this queue in block 302. Using the largest packet size of all PHBs mapped to this queue, one can specify the queue size in bytes.
2. The drain size of a given queue is proportional to the sum of the forwarding factors of all services mapped to this queue in block 302. For each queue type, the queue with the smallest associated forwarding factor is assigned a drain size of one packet size. All other queues in this queue type get a proportional drain size, according to their forwarding factors.

An example weights table for the Service Template shown in FIG. 2 is set forth in Table 6. For purposes of Table 6, a Total Queue Size of 60 packets is assumed.

TABLE 6 EXAMPLE WEIGHTS TABLE

    Queue Type    Queue Number    Drain Size (bytes)    Queue Size (bytes)
    1q            1               1500                  90000
    2q            1               225                   81000
                  2               200                   1200
    1q1p          1               225                   81000
                  1p              200                   1200
    2q1p          1               1500                  27000
                  2               12000                 54000
                  1p              12000                 1200
    3q            1               1500                  27000
                  2               12000                 54000
                  3               12000                 1200
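The weights calculation can be sketched in Python as shown below. The per-PHB buffer factors (30, 30, 30, 10), the packet sizes (1500 bytes, and 200 bytes for EF), and the per-class forwarding factors (64, 512, 512) are assumptions chosen because they are consistent with Tables 2 and 6; FIG. 2 is the authoritative source for the actual values. The names `Phb`, `FORWARDING_FACTOR`, and `weights` are hypothetical, and sharing one forwarding factor per forwarding class is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Phb:
    name: str
    fwd_class: str
    buffer_factor: int   # percent of the total queue size (assumed value)
    packet_size: int     # bytes (assumed value)


# Assumed forwarding factor per forwarding class, shared by its PHBs.
FORWARDING_FACTOR = {"BE": 64, "AF1": 512, "EF": 512}

PHBS = [Phb("BE", "BE", 30, 1500), Phb("AF11", "AF1", 30, 1500),
        Phb("AF12", "AF1", 30, 1500), Phb("EF", "EF", 10, 200)]


def weights(phbs, mapping, total_queue_packets):
    """Return {queue: (drain_size_bytes, queue_size_bytes)}."""
    queues = {}
    for p in phbs:
        q = queues.setdefault(mapping[p.name],
                              {"buffer": 0, "classes": set(), "pkt": 0})
        q["buffer"] += p.buffer_factor             # sum of buffer factors
        q["classes"].add(p.fwd_class)
        q["pkt"] = max(q["pkt"], p.packet_size)    # largest packet size
    for q in queues.values():
        q["fwd"] = sum(FORWARDING_FACTOR[c] for c in q["classes"])

    # The queue with the smallest forwarding factor drains one packet per cycle.
    smallest = min(queues.values(), key=lambda q: q["fwd"])
    table = {}
    for queue_id, q in queues.items():
        drain = smallest["pkt"] * q["fwd"] // smallest["fwd"]
        size = total_queue_packets * q["buffer"] // 100 * q["pkt"]
        table[queue_id] = (drain, size)
    return table


MAPPING_2Q1P = {"BE": 1, "AF11": 2, "AF12": 2, "EF": "1p"}
print(weights(PHBS, MAPPING_2Q1P, 60))
```

Under these assumed inputs the sketch reproduces the 1q, 2q, and 2q1p rows of Table 6.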

In block 306, the threshold positions within each queue are determined. If enough queues are available within the device, each forwarding class is mapped to a different queue. Each of the PHBs within a PHB group (i.e., PHBs that have the same forwarding class) should be mapped to a different threshold. The threshold positions within this queue should be proportional to the buffer factors of each of the PHBs, or alternatively should equal the reserved packets of each of the PHBs. For example, when the PHB set of FIG. 2 is mapped to a 2q1p queue type, both AF11 and AF12 are mapped to queue 2. Queue 2 should have a single threshold positioned at 50% of the queue length. This threshold would then enforce the buffer factor relation between AF11 and AF12. If not enough thresholds are available to differentiate between the different PHBs within a forwarding class, or if the threshold positions cannot be set appropriately, the network administrator should be notified, and the closest approximation should be taken.
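One way to read the proportional placement is cumulatively: each PHB's threshold sits at the running sum of buffer factors divided by the total for the queue, so that the last PHB may fill the whole queue. The Python sketch below follows that reading, which is an assumption, as are the function name and the equal buffer factors of 30 used for AF11 and AF12.

```python
def threshold_positions(buffer_factors):
    """Return [(phb_name, threshold as a percent of queue length)].

    buffer_factors: list of (phb_name, buffer_factor), ordered from the PHB
    that is dropped first to the PHB that may fill the whole queue.
    """
    total = sum(factor for _, factor in buffer_factors)
    positions, running = [], 0
    for name, factor in buffer_factors:
        running += factor                      # cumulative share of the queue
        positions.append((name, 100 * running // total))
    return positions


# Queue 2 of the 2q1p queue type, with assumed equal buffer factors.
print(threshold_positions([("AF12", 30), ("AF11", 30)]))
```

With equal buffer factors this places the single intermediate threshold at 50% of the queue length, as in the 2q1p example above.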

When not enough queues are available, more than one forwarding class is mapped to a single queue. In this case, different PHBs can be mapped to different thresholds according to their different buffer factor or reserved packet fields. For example, Table 5 indicates that both AF1 and BE are mapped to the same queue in 2q queue types. Both AF12 and BE of FIG. 2 are configured with the same buffer factor and therefore can use the same threshold. Since AF11 can utilize the whole queue, the threshold position should then be set to 50% in this example.

In block 308, the parameters for fragmentation and interleave mechanisms are determined. Fragmentation and interleave mechanisms are required only if a PHB with an immediate forwarding flag is set, and only if the link MTU is larger than the fragment size (fs) determined according to the relation:

fs = max-phop-delay × link-speed

If fs is smaller than the link MTU, then all packets that belong to non-immediate forwarding PHBs should be fragmented to a size not bigger than fs.
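The fragment size decision can be sketched in Python as follows. The sketch reads the relation as the maximum per-hop delay multiplied by the link speed, since that product is the amount of data the link can transmit within the delay bound; this reading, the units (milliseconds, bits per second, bytes), and the function names are assumptions.

```python
def fragment_size_bytes(max_phop_delay_ms, link_speed_bps):
    """Bytes the link can transmit within the maximum per-hop delay."""
    # ms * bits/s / 1000 gives bits; dividing by 8 gives bytes.
    return max_phop_delay_ms * link_speed_bps // 8000


def fragmentation_plan(has_immediate_phb, max_phop_delay_ms,
                       link_speed_bps, link_mtu_bytes):
    """Return the fragment size to configure, or None if none is needed."""
    if not has_immediate_phb:
        return None            # no immediate forwarding PHB to protect
    fs = fragment_size_bytes(max_phop_delay_ms, link_speed_bps)
    if fs < link_mtu_bytes:
        return fs              # fragment non-immediate traffic to at most fs
    return None                # a full-MTU packet already fits the delay bound


print(fragmentation_plan(True, 10, 64_000, 1500))       # slow link
print(fragmentation_plan(True, 10, 10_000_000, 1500))   # fast link
```

On the hypothetical 64 kbit/s link with a 10 ms bound, the sketch yields a fragment size of 80 bytes; on the 10 Mbit/s link no fragmentation is needed.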

Hardware Overview

FIG. 4 is a block diagram that illustrates a computer system 500 upon which an embodiment of the invention may be implemented. The preferred embodiment is implemented using one or more computer programs running on a network element such as a router device. Thus, in this embodiment, the computer system 500 is a router.

Computer system 500 includes a bus 502 or other communication mechanism for communicating information, and a processor 504 coupled with bus 502 for processing information. Computer system 500 also includes a main memory 506, such as a random access memory (RAM), flash memory, or other dynamic storage device, coupled to bus 502 for storing information and instructions to be executed by processor 504. Main memory 506 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 504. Computer system 500 further includes a read only memory (ROM) 508 or other static storage device coupled to bus 502 for storing static information and instructions for processor 504. A storage device 510, such as a magnetic disk, flash memory or optical disk, is provided and coupled to bus 502 for storing information and instructions.

A communication interface 518 may be coupled to bus 502 for communicating information and command selections to processor 504. Interface 518 is a conventional serial interface such as an RS-232 or RS-422 interface. An external terminal 512 or other computer system connects to the computer system 500 and provides commands to it using the interface 518. Firmware or software running in the computer system 500 provides a terminal interface or character-based command interface so that external commands can be given to the computer system.

A switching system 516 is coupled to bus 502 and has an input interface 514 and an output interface 519 to one or more external network elements. The external network elements may include a local network 522 coupled to one or more hosts 524, or a global network such as Internet 528 having one or more servers 530. The switching system 516 switches information traffic arriving on input interface 514 to output interface 519 according to pre-determined protocols and conventions that are well known. For example, switching system 516, in cooperation with processor 504, can determine a destination of a packet of data arriving on input interface 514 and send it to the correct destination using output interface 519. The destinations may include host 524, server 530, other end stations, or other routing and switching devices in local network 522 or Internet 528.

The invention is related to the use of computer system 500 for the techniques and functions described herein in a network system. According to one embodiment of the invention, such techniques and functions are provided by computer system 500 in response to processor 504 executing one or more sequences of one or more instructions contained in main memory 506. Such instructions may be read into main memory 506 from another computer-readable medium, such as storage device 510. Execution of the sequences of instructions contained in main memory 506 causes processor 504 to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 506. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 504 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 510. Volatile media includes dynamic memory, such as main memory 506. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 502. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 504 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 500 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to bus 502 can receive the data carried in the infrared signal and place the data on bus 502. Bus 502 carries the data to main memory 506, from which processor 504 retrieves and executes the instructions. The instructions received by main memory 506 may optionally be stored on storage device 510 either before or after execution by processor 504.

Communication interface 518 also provides a two-way data communication coupling to a network link 520 that is connected to a local network 522. For example, communication interface 518 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 518 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 518 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 520 typically provides data communication through one or more networks to other data devices. For example, network link 520 may provide a connection through local network 522 to a host computer 524 or to data equipment operated by an Internet Service Provider (ISP) 526. ISP 526 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 528. Local network 522 and Internet 528 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 520 and through communication interface 518, which carry the digital data to and from computer system 500, are exemplary forms of carrier waves transporting the information.

Computer system 500 can send messages and receive data, including program code, through the network(s), network link 520 and communication interface 518. In the Internet example, a server 530 might transmit a requested code for an application program through Internet 528, ISP 526, local network 522 and communication interface 518. In accordance with the invention, one such downloaded application provides for the techniques and functions that are described herein.

The received code may be executed by processor 504 as it is received, and/or stored in storage device 510, or other non-volatile storage for later execution. In this manner, computer system 500 may obtain application code in the form of a carrier wave.

Alternatives, Extensions

The system as described herein provides for a network-wide PHB definition for maintaining consistent per-hop-behaviors within a network. By providing a schema that allows a user to define a relationship between the different PHBs that are associated with a DS domain, a mechanism is provided whereby a network administrator can tune the bandwidth and buffer resources allocated to each PHB within the DS domain. Moreover, the tuning of the bandwidth and buffer resources allocated to each PHB may be performed without regard to the particular configurations of each device.

In describing certain embodiments of the invention, several drawing figures have been used for explanation purposes. However, the invention is not limited to any particular context as shown in drawing figures, and the spirit and scope of the invention include other contexts and applications in which the mechanisms and processes described herein are available to other mechanisms, methods, programs, and processes. Thus, the specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

For example, although not depicted in FIG. 1C, in certain embodiments, policy servers 106, 110 may be coupled to and thus communicate with edge devices 120, 122 via network 128.

In addition, in this disclosure, including in the claims, certain process steps are set forth in a particular order, and alphabetic and alphanumeric labels are used to identify certain steps. Unless specifically stated in the disclosure, embodiments of the invention are not limited to any particular order of carrying out such steps. In particular, the labels are used merely for convenient identification of steps, and are not intended to imply, specify or require a particular order of carrying out such steps. 

1. A method for maintaining consistent per-hop packet forwarding behavior among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising the computer-implemented steps of: creating and storing a network-wide per-hop forwarding behavior (PHB) definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; determining a mapping of one or more of the PHBs in the network-wide PHB definition to one or more queues of the network devices; determining a drain size and a queue size for each of the queues to which PHBs are mapped; determining a mapping of each of the PHBs to a threshold value associated with the queues; and determining parameters of fragmentation and interleave mechanism for the device interfaces.
 2. A method as recited in claim 1, further comprising the step of sending network device configuration parameter values based on the mappings, the drain size, the queue size, and the parameters to each of the network devices within the DS domain.
 3. A method as recited in claim 1, wherein the network-wide service definition further associates a service with a buffer value that identifies a percentage of buffer space that the associated service is assured to receive in any of the network devices.
 4. A method as recited in claim 1, wherein the network-wide service definition further associates a service with an excess action value that identifies a default treatment that the network devices enforce on traffic of the associated service that exceeds an associated rate value.
 5. A method as recited in claim 1, wherein creating and storing a network-wide service definition comprises the steps of: creating and storing a network-wide service definition that associates a first service and a second service in a forwarding class each with a DS code point (DSCP) value, with a minimum bandwidth value, and with an excess action value that identifies a default treatment that the network devices enforce on traffic of the associated service that exceeds an associated rate value, wherein the excess action value of the second service specifies that the network devices shall re-mark packets of the first service to indicate the second service.
 6. A method as recited in claim 5, wherein a single minimum bandwidth value is specified in the network-wide service definition for both the first service and the second service.
 7. A method as recited in claim 6, wherein the buffer value for the first service is less than the buffer value for the second service.
 8. A method as recited in claim 1, wherein the step of determining a mapping of one or more of the services in the network-wide service definition to one or more queues of the network devices comprises the steps of mapping all services within the same forwarding class to the same queue, such that a maximum number of queues is equal to a number of the forwarding classes.
 9. A method as recited in claim 1, wherein the step of determining a drain size and a queue size for each of the queues to which services are mapped comprises the steps of: receiving a packet size value representing a typical packet size in bytes; receiving a total queue size value representing a total queue length in packets on all interfaces; determining the queue size value in proportion to a sum of buffer values of all services that are mapped to the associated queue as compared to the total queue size value; determining the drain size value, wherein a queue having a smallest associated transmit value is assigned a drain size value equal to the packet size and all other queues are assigned a drain size value in proportion to a sum of all transmit values of all services that are mapped to the associated queue.
 10. A method as recited in claim 1, wherein mapping services to thresholds comprises the steps of mapping each service among a plurality of services within a forwarding class to different thresholds; and mapping services that require that packets are not discarded even if delayed to a higher threshold when there are not enough queues to separate among flows of services.
 11. A method as recited in claim 1, wherein determining threshold position values comprises the steps of: for any forwarding class having more than one service, determining the threshold positions based on the buffer values of one service in the forwarding class that has an excess action value indicating remarking.
 12. A method as recited in claim 1, wherein determining threshold position values comprises the steps of determining a first threshold position value for a first service of a forwarding class and determining a second threshold position value for a second service of the forwarding class based on a ratio of the buffer values of the first service and the second service.
 13. A computer-readable medium comprising one or more sequences of instructions for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of: creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; determining a mapping of one or more of the PHBs in the network-wide service definition to one or more queues of the network devices; determining a drain size and a queue size for each of the queues to which PHBs are mapped; determining a mapping of each of the PHBs to a threshold value associated with the queues; and determining parameters for fragmentation and interleave mechanisms for the device interfaces.
 14. The computer-readable medium as recited in claim 13 further comprising the step of sending network device configuration parameter values based on the mappings, the drain size, the queue size, and the parameters to each of the network devices within the DS domain.
 15. An apparatus for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising: means for creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; means for determining a mapping of one or more of the PHBs in the network-wide PHB definition to one or more queues of the network devices; means for determining a drain size and a queue size for each of the queues to which PHBs are mapped; means for determining a mapping of each of the PHBs to a threshold value associated with the queues; and means for determining parameters of fragmentation and interleave mechanisms for the device interfaces.
 16. The apparatus recited in claim 15, further comprising means for sending network device configuration parameter values based on the mappings, the drain size, the queue size, and the parameters to each of the network devices within the DS domain.
 17. An apparatus for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising: a processor; a network interface coupled to the processor for communicating data packets among the network devices and the processor; a memory having one or more stored sequences of instructions which, when executed by the processor, cause the processor to carry out the steps of: creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; determining a mapping of one or more of the PHBs in the network-wide PHB definition to one or more queues of the network devices; determining a drain size and a queue size for each of the queues to which PHBs are mapped; determining a mapping of each of the PHBs to a threshold value associated with the queues; and determining parameters of fragmentation and interleave mechanism for the device interfaces.
 18. The apparatus recited in claim 17, further comprising sending network device configuration parameter values based on the mappings, the drain size, and the queue size, and the parameters to each of the network devices within the DS domain.
 19. A method for maintaining consistent per-hop packet forwarding behavior PHB among a plurality of network devices in a network within a Differentiated Services (DS) domain, comprising the computer-implemented steps of: creating and storing a network-wide PHB definition that associates a PHB with a DS code point (DSCP) value, and with a set of parameters that define the bandwidth and buffer resources allocated to the PHBs on all interfaces of network devices within the DS domain; mapping of each of the PHBs in the network-wide PHB definition to one or more queues of the network devices; creating and storing one or more network device configuration parameter values based on the mapping; sending the configuration parameter values to each of the network devices within the DS domain. 